Repetition-free longest common subsequence of random sequences

Authors

  • Marcos A. Kiwi
  • Cristina G. Fernandes
Abstract

A repetition-free Longest Common Subsequence (LCS) of two sequences x and y is a longest common subsequence of x and y in which each symbol appears at most once. Let R denote the length of a repetition-free LCS of two sequences of n symbols, each symbol chosen uniformly at random and independently from a k-ary alphabet. We study the asymptotic behavior of R in n and k and establish that there are three distinct regimes, depending on the relative speed of growth of n and k. For each regime we establish the limiting behavior of R. In fact, we do more: we establish tail bounds for large deviations of R from its limiting behavior. Our study is motivated by the so-called exemplar model proposed by Sankoff (1999) and the related similarity measure introduced by Adi et al. (2007). A natural question that arises in this context, which as we show is related to long-standing open problems in probabilistic combinatorics, is to understand the asymptotic behavior of the parameter R in n and k.
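The definition of R can be made concrete with a small brute-force sketch in Python. It is only an illustration of the definition, not the method used in the paper: the function names `repetition_free_lcs_length` and `sample_R` are hypothetical, and the search is exponential in the alphabet size, so it is suitable only for very small n and k.

```python
import random
from functools import lru_cache


def repetition_free_lcs_length(x, y):
    """Length of a longest common subsequence of x and y in which
    every symbol appears at most once (exhaustive, memoized search)."""

    @lru_cache(maxsize=None)
    def best(i, j, used):
        # No symbols left in one of the sequences: nothing more to match.
        if i == len(x) or j == len(y):
            return 0
        # Either x[i] or y[j] is left out of the common subsequence...
        result = max(best(i + 1, j, used), best(i, j + 1, used))
        # ...or x[i] is matched with y[j], allowed only for a fresh symbol.
        if x[i] == y[j] and x[i] not in used:
            result = max(result, 1 + best(i + 1, j + 1, used | {x[i]}))
        return result

    return best(0, 0, frozenset())


def sample_R(n, k, rng):
    """Draw two independent uniform sequences of length n over the
    alphabet {0, ..., k-1} and return their repetition-free LCS length."""
    x = tuple(rng.randrange(k) for _ in range(n))
    y = tuple(rng.randrange(k) for _ in range(n))
    return repetition_free_lcs_length(x, y)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, chosen arbitrarily for repeatability
    n, k = 8, 4
    samples = [sample_R(n, k, rng) for _ in range(20)]
    print("empirical mean of R:", sum(samples) / len(samples))
```

For instance, `repetition_free_lcs_length("aab", "aab")` is 2: the ordinary LCS is "aab", but only "ab" avoids repeating a symbol. In general R can never exceed min(n, k), which hints at why the relative growth of n and k matters.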

Related papers

Repetition-free longest common subsequence

We study the problem of finding, given two sequences x and y over a finite alphabet, a repetition-free longest common subsequence of x and y. We present several algorithmic results and a complexity result, and we describe a preliminary experimental study based on the proposed algorithms.

Finding Longest Common Increasing Subsequence for Two Different Scenarios of Non-random Input Sequences

By reviewing the Longest Increasing Subsequence (LIS) and Longest Common Subsequence (LCS) problems, the Longest Common Increasing Subsequence (LCIS) problem is explored in detail for two non-random input cases. Specifically, we designed two algorithms, one for the scenario in which one sequence is ordered and duplicate elements are allowed in each of the sequences, and the second ...

On the parameterized complexity of the repetition free longest common subsequence problem

Finding the longest subsequence present in two different strings is known as the Longest Common Subsequence (LCS) problem. It has been widely used as a measure to compare strings in different fields [2], in particular for the comparison of two (or more) genomes in Bioinformatics. Genomes are usually viewed as strings, where each symbol represents a gene, and the comparison of the strings associ...

A Comprehensive Comparison of Metaheuristics for the Repetition-Free Longest Common Subsequence Problem

This paper deals with an NP-hard string problem from the bio-informatics field: the repetition-free longest common subsequence problem. This problem has enjoyed an increasing interest in recent years, which has resulted in the application of several pure as well as hybrid metaheuristics. However, the literature lacks a comprehensive comparison between those approaches. Moreover, it has been sho...

Journal:
  • Discrete Applied Mathematics

Volume 210

Publication date: 2016